Extending Consensus Clustering to Explore Multiple Clustering Views
نویسندگان
چکیده
Consensus clustering has emerged as an important extension of the classical clustering problem. Given a set of input clusterings of a given dataset, consensus clustering aims to find a single final clustering which is a better fit in some sense than the existing clusterings. There is a significant drawback in generating a single consensus clustering since different input clusterings could differ significantly. In this paper, we develop a new framework, called Multiple Consensus Clustering (MCC), to explore multiple clustering views of a given dataset from a set of input clusterings. Instead of generating a single consensus, MCC organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions. A dynamic programming algorithm is proposed to obtain a flat partition from the hierarchical tree using the modularity measure. Multiple consensuses are finally obtained by applying consensus clustering algorithms to each cluster of the partition. Extensive experimental results on 11 real world data sets and a case study on a Protein-Protein Interaction (PPI) data set demonstrate the effectiveness of our proposed method.
منابع مشابه
Consensus Clustering + Meta Clustering = Multiple Consensus Clustering
Consensus clustering and meta clustering are two important extensions of the classical clustering problem. Given a set of input clusterings of a given dataset, consensus clustering aims to find a single final clustering which is a better fit in some sense than the existing clusterings, and meta clustering aims to group similar input clusterings together so that users only need to examine a smal...
متن کاملEntropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
متن کاملMultiple Clustering Views from Multiple Uncertain Experts
Expert input can improve clustering performance. In today’s collaborative environment, the availability of crowdsourced multiple expert input is becoming common. Given multiple experts’ inputs, most existing approaches can only discover one clustering structure. However, data is multi-faceted by nature and can be clustered in different ways (also known as views). In an exploratory analysis prob...
متن کاملHybrid Hierarchical Clustering: Forming a Tree From Multiple Views
We propose an algorithm for forming a hierarchical clustering when multiple views of the data are available. Different views of the data may have different underlying distance measures which suggest different clusterings. In such cases, combining the views to get a good clustering of the data becomes a challenging task. We allow these different underlying distance measures to be arbitrary Bregm...
متن کاملMulti-View Clustering via Joint Nonnegative Matrix Factorization
Many real-world datasets are comprised of different representations or views which often provide information complementary to each other. To integrate information from multiple views in the unsupervised setting, multiview clustering algorithms have been developed to cluster multiple views simultaneously to derive a solution which uncovers the common latent structure shared by multiple views. In...
متن کامل